Nuisance captures of black bears in Florida

While Florida may be well known for the rambunctious activities of alligators, I chose to better understand the types of bears being captured for nuisance reasons. Nuisance bears are not injured or captured for research. They are simply running around and causing trouble, as bears sometimes do. There could be a great story from this data: an exploration of different bears, highlighting a juvenile bear being relocated from rummaging through trash, or looking into what kind of bears are recaptures. I really love that this data included the whole “bear capture timeline” from capture date, capture method, and capture reason, to its final action, whether that be rehab, release, or other.

raw <- read_sf("../data/blackbears/FL_Black_Bear_Capture_Locations.shp")
raw
## Simple feature collection with 1825 features and 33 fields (with 2 geometries empty)
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -87.4664 ymin: 25.57503 xmax: -80.25332 ymax: 30.98165
## Geodetic CRS:  WGS 84
## # A tibble: 1,825 × 34
##    OBJECTID IncidentID    ReportDate CapDate    Region BMU   County Locale City 
##       <int> <chr>         <date>     <date>     <chr>  <chr> <chr>  <chr>  <chr>
##  1     8757 Bear-0000016… 1900-01-01 2014-07-25 North… Cent… Putnam <NA>   <NA> 
##  2     8758 Bear-0000016… 1900-01-01 2013-07-25 North… Cent… Putnam <NA>   <NA> 
##  3     8759 Bear-0000009… 1981-10-15 1981-10-15 North… Cent… Semin… <NA>   SANF…
##  4     8760 Bear-0000014… 1982-05-28 NA         North… Cent… Lake   OTHER  OCAL…
##  5     8761 Bear-0000014… 1985-01-01 NA         North… West… Okalo… OTHER  <NA> 
##  6     8762 Bear-0000006… 1986-05-13 1986-05-14 North… North Baker  OTHER  SEBR…
##  7     8763 Bear-0000014… 1986-06-10 NA         North… West… Okalo… OTHER  <NA> 
##  8     8764 Bear-0000009… 1986-08-02 1986-08-02 North… Cent… Clay   OTHER  MIDD…
##  9     8765 Bear-0000009… 1986-09-10 1986-09-10 North… North Baker  OTHER  GLEN…
## 10     8766 Bear-0000014… 1987-07-30 1987-07-30 South  South Colli… <NA>   <NA> 
## # … with 1,815 more rows, and 25 more variables: ZipCode <chr>, Latitude <dbl>,
## #   Longitude <dbl>, CapReason <chr>, OCapReason <chr>, NCapReason <chr>,
## #   CapMethod <chr>, OrigCap <chr>, FinAction <chr>, OFinAction <chr>,
## #   Sex <chr>, AgeClass <chr>, Weight <dbl>, Age <dbl>, Cubs <chr>,
## #   Yearlings <chr>, LEarTag1 <chr>, LEarTag2 <chr>, REarTag1 <chr>,
## #   REarTag2 <chr>, Tattoo1 <chr>, Tattoo2 <chr>, TrackNum <chr>,
## #   last_edite <date>, geometry <POINT [°]>
fl <- map_data("state", "florida")

ncap <- raw %>% 
  drop_na(NCapReason)
nbears_plot <- ggplot() +
  geom_polygon(fl, mapping = aes(long, lat), fill = NA, colour = "grey60") +
  geom_sf(st_as_sf(ncap), mapping = aes(color = NCapReason), size = 1.7, alpha = 0.5) +
  theme_classic() +
  labs(
    x = "",
    y = "",
    title = "Capture of nuisance black bears in Florida",
    caption = "Source: Florida Geospatial Open Data Portal",
    color = "Reasons for capture")
nbears_plot

The shapefile gave us GIS data to locate where each bear was being captured, with colors to highlight the kind of capture occurring. I chose to show only the outline of Florida for the cleanest possible look. However, another iteration may include a light outline of counties to help contextualize capture location, while not getting in the way of readability. I find this visualization sparks me no joy, so I decided to reuse many core elements, make it interactive, and give more context to liven it up a little.

nbears_more_info <- ggplot() +
  geom_polygon(fl, mapping = aes(long, lat), fill = NA, colour = "grey60") +
  geom_sf(st_as_sf(ncap), mapping = aes(color = NCapReason, text = paste("Method:", CapMethod, "\nAge:", AgeClass)), size = 1.7, alpha = 0.5) +
  theme_classic() +
  labs(
    x = "",
    y = "",
    title = "Capture of nuisance black bears in Florida",
    color = "Reasons for capture")

interact <- ggplotly(nbears_more_info)
interact
htmlwidgets::saveWidget(interact, "nuisance_bears.html")

Interactive plots are so powerful. Hovering over each data point gives us a story of a bear: the reason for capture, the method of capture, and the age of the bear. This plot also allows us to show/hide each category of nuisance capture, and hone in what we find as important. This interactivity also gives us some feature knowledge that would be otherwise difficult to glean; for example, two categories are very similar– “Attacked Animal” and “Threatened Animal.” From the visualization, we can easily see that these two categories are rare, and it might be useful to group them for modeling purposes. One thing I found pretty annoying was the fact that ggplotly() drops my caption AND subtitle lab! I ended up removing them altogether since they did not render.

set.seed(1223)
model_data <- raw %>% 
  select(Age, Weight, NCapReason, BMU) %>% 
  mutate_at(vars(NCapReason, BMU), factor)

bear_split <- initial_split(model_data, prop = 0.85)
train <- training(bear_split)
test <- testing(bear_split)
bear_model <-
  workflow(Age ~ Weight + NCapReason + BMU, linear_reg()) %>%
  step_indicate_na(Weight, NCapReason, BMU) %>%
  fit(train)

tidy(bear_model)
## # A tibble: 15 × 5
##    term                          estimate std.error statistic   p.value
##    <chr>                            <dbl>     <dbl>     <dbl>     <dbl>
##  1 (Intercept)                   -1.02      1.88      -0.539   0.590   
##  2 Weight                         0.00496   0.00127    3.91    0.000130
##  3 `NCapReasonIn Apiary`          1.39      1.03       1.35    0.178   
##  4 `NCapReasonIn Feed/Feeder`    -0.448     1.26      -0.354   0.723   
##  5 `NCapReasonIn Garbage`         0.204     0.982      0.208   0.835   
##  6 `NCapReasonKilled Animal`     -0.544     1.32      -0.411   0.682   
##  7 NCapReasonOTHER               -0.533     1.16      -0.459   0.647   
##  8 `NCapReasonProperty Damage`    0.0236    0.981      0.0240  0.981   
##  9 `NCapReasonThreatened Animal` -0.919     2.47      -0.372   0.710   
## 10 BMUCentral                     1.07      1.64       0.650   0.516   
## 11 `BMUEast Panhandle`            1.64      1.65       0.997   0.320   
## 12 BMUNorth                      NA        NA         NA      NA       
## 13 BMUSouth                       0.243     1.68       0.145   0.885   
## 14 `BMUSouth Central`             3.27      1.83       1.79    0.0757  
## 15 `BMUWest Panhandle`            0.969     1.89       0.514   0.608

For this model, I used a linear model to find out the relationship bewtween the age of captured bears and their weight, nuisance capture reason, and location.

pred <- predict(bear_model, new_data = test)
conf_int_pred <- predict(bear_model, new_data = test, type = "conf_int")

test %>% 
  bind_cols(pred) %>% 
  bind_cols(conf_int_pred) %>%
  ggplot(aes(x = NCapReason)) +
  geom_errorbar(aes(ymin = 0, 
                    ymax = .pred_upper),
                alpha = 0.5,
                width = .2) +
  geom_point(aes(y = .pred)) +
  theme_minimal() +
  labs(
    x = "Nuisance capture reason",
    y = "Predicted age"
  )

This plot is horribly ugly, but it’s being presented because I found it entertaining that (using some hand-wavy over generalization), young bears are found in the garbage or damaging property, where older bears are found in apiaries (where bees are kept). Those teenage bears are out there getting into trouble while the older bears have finer taste. This plot is also useful since it shows our model can predict a negative age. This is due to model choice, since linear models have no understanding of zero as an absolute minimum. It might be useful to look into a different model that is a better fit for the outcome variable.

bear_coefs <- tidy(bear_model, conf.int = TRUE) %>% 
  filter(term != "(Intercept)")

ggplot(bear_coefs, aes(x = estimate,
                       y = fct_rev(term))) +
  geom_pointrange(aes(xmin = conf.low, 
                      xmax = conf.high)) +
  geom_vline(xintercept = 0) + 
  theme_minimal() +
  labs (
    title = "Coefficient plot",
    subtitle =  "Linear model for estimating nuisance bear age",
    y = ""
  )

The strongest positive effects here are from location (particularly the South Central or East Panhandle area) and being in the apiary (again, I really do love that for them). The strongest negative effects are from threatening an animal or killing an animal. However, these coefficients are a great place to begin exploration. One coefficient that was surprising was that weight had no effect on age of bear. This seems wrong, since bears certainly get heavier as they age, at least between ages 0-1. To be clear, these are the effects that these coefficients had on a linear model and not a holistic view on the age distribution of nuisance bears.